Comparing the Performance of Data Mining Tools: WEKA and DTREG
Abstract
The objective of this paper is to compare two data mining tools, WEKA and DTREG, on the basis of various estimation criteria. Both tools are used to build a multilayer perceptron, a data mining model, to predict the survivability of oral cancer patients. An oral cancer database is considered because oral cancer is estimated to be the 8th most common cancer worldwide and is an extremely grave problem in India as well. Early detection is the only way to prevent the disease and reduce this burden. DTREG is a proprietary data mining tool, whereas WEKA is open source. The classification accuracy of the multilayer perceptron model is 70.05% when developed using DTREG and 59.70% when developed using WEKA. DTREG uses 10-fold cross-validation for validation, and WEKA uses stratified cross-validation. DTREG demonstrated better results in terms of true negatives, false negatives, specificity, recall, and area under the ROC curve. However, WEKA displays better results in terms of true positives, false positives, precision, and F-measure. The analysis run time of DTREG is less than that of WEKA, and the report generated by DTREG is also more expressive and descriptive, which makes DTREG a better data mining tool for multilayer perceptron models.
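The estimation criteria named in the abstract (accuracy, precision, recall, specificity, and F-measure) all derive from the four confusion-matrix counts of a binary classifier. The following is a minimal sketch of those standard definitions, assuming a binary outcome. The class name `ConfusionMatrix`, the helper `_ratio`, and the counts in the usage example are hypothetical illustrations and are not taken from the paper or from either tool.

```python
from dataclasses import dataclass


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts for a binary classifier (hypothetical helper, not a tool API)."""

    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives

    @property
    def accuracy(self) -> float:
        # Fraction of all cases classified correctly.
        return _ratio(self.tp + self.tn, self.tp + self.fp + self.tn + self.fn)

    @property
    def precision(self) -> float:
        # Fraction of predicted positives that are truly positive.
        return _ratio(self.tp, self.tp + self.fp)

    @property
    def recall(self) -> float:
        # Fraction of actual positives that were found (sensitivity).
        return _ratio(self.tp, self.tp + self.fn)

    @property
    def specificity(self) -> float:
        # Fraction of actual negatives that were correctly rejected.
        return _ratio(self.tn, self.tn + self.fp)

    @property
    def f_measure(self) -> float:
        # Harmonic mean of precision and recall.
        p, r = self.precision, self.recall
        return _ratio(2 * p * r, p + r)


if __name__ == "__main__":
    # Made-up counts, for illustration only.
    cm = ConfusionMatrix(tp=40, fp=10, tn=30, fn=20)
    for name in ("accuracy", "precision", "recall", "specificity", "f_measure"):
        print(f"{name}: {getattr(cm, name):.4f}")
```

Because precision and F-measure depend on true and false positives while specificity depends on true and false negatives, one tool can lead on the first group and the other on the second, as the abstract reports.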
Similar resources
Performance Improvement of Data Mining in Weka through GPU Acceleration
Data mining tools may be computationally demanding, so there is increasing interest in parallel computing strategies to improve their performance. The popularization of Graphics Processing Units (GPUs) increased the computing power of current desktop computers, but desktop-based data mining tools do not usually take full advantage of these architectures. This paper exploits an approach to im...
A Comparison of Data Mining Tools using the implementation of C4.5 Algorithm
This paper presents an implementation on a healthcare dataset using data mining tools to find important parameters that reflect the effect of diabetes on the kidneys of patients. This is done with the use of Kidney Function Tests (KFT). The data mining tools used are Tanagra and Weka, with the application of the C4.5 algorithm, which is based on decision trees. This paper compares the result given by Wek...
GEORGY MINAEV Comparing GUHA and Weka Methods in Data Mining
TAMPERE UNIVERSITY OF TECHNOLOGY Master's Degree Programme in Information Technology MINAEV, GEORGY: Comparing GUHA and Weka Methods in Data Mining Master of Science Thesis, 67 pages, 0 Appendix pages 07 November 2012 Major: Mathematics Examiner: Associate Professor Esko Turunen and Professor Ari Visa
Performance Analysis of Engineering Students for Recruitment Using Classification Data Mining Techniques
Data mining is a powerful tool for academic intervention. Mining in an educational environment is called Educational Data Mining. Educational Data Mining is concerned with developing new methods to discover knowledge from educational databases and can be used for decision making in the educational system. In our work, we collected the student’s data from engineering institute that have different informatio...
Predicting Bankruptcy of Companies using Data Mining Models and Comparing the Results with Z Altman Model
One of the factors that helps in making investment decisions is the availability of appropriate tools and models to evaluate the financial situation of the organization. By means of these tools, investors can analyze the financial situation of the organization, identify financial distress or an ideal condition, and make informed decisions to invest in appropriate conditions. The main objective of this study is to ev...